test: expand agent harness coverage#513
Conversation
- Introduced a suite of tests for the `fire_hooks` function, ensuring that all hooks are dispatched even if one fails, enhancing reliability in the hook execution flow. - Added tests for the `sanitize_tool_output` function to validate the correct mapping of success and failure messages for various tool outputs. - Implemented a mock memory structure to facilitate testing of memory context building, ensuring that working memory is prioritized and deduplication occurs correctly. - Enhanced the `build_memory_context` function tests to verify filtering and truncation of memory entries, improving the robustness of memory management in the system. - Overall, these tests aim to strengthen the codebase by ensuring that critical functionalities related to hooks and memory context are thoroughly validated.
…mpt options - Updated the `provider_alias_and_route_selection_round_trip` test to dynamically resolve the first provider from the registry, ensuring accurate alias resolution. - Added new tests for `DumpPromptOptions` and `ComposioStubTools`, validating default settings and expected tool names. - Introduced tests for rendering main agent dumps, ensuring tool instructions and skill counts are correctly included in the output. - Enhanced prompt handling tests to cover cache boundary extraction and subagent render options, improving overall test coverage and reliability.
- Added new tests for `AgentError` variants to validate string formatting and error recovery from `anyhow`. - Improved formatting consistency in existing tests for better readability. - Enhanced the `sanitize_tool_output` test to ensure accurate mapping of success and failure messages. - Updated memory loader tests to enforce minimum limits and budget constraints, ensuring robust memory management. - Overall, these changes aim to strengthen the test suite by improving coverage and clarity in error handling scenarios.
…ering - Introduced new tests to validate the filtering of tools based on named scopes and disallowed tools, ensuring correct tool selection in debug dumps. - Added tests for rendering subagent dumps, including handling file prompt fallbacks and missing files without panicking. - Enhanced the workspace prompt handling to prefer custom prompt locations, improving the flexibility and reliability of agent prompt rendering. - Overall, these additions strengthen the test suite by covering critical functionalities related to tool filtering and prompt rendering.
…modal helpers - Introduced new tests for the memory loader to validate behavior when the header exceeds budget and when recall fails, ensuring robust memory management. - Added tests for multimodal helpers, covering image marker counting, payload extraction, and MIME type normalization, enhancing the reliability of multimodal interactions. - Overall, these additions strengthen the test suite by improving coverage and ensuring correct functionality in memory handling and multimodal processing.
- Introduced new tests for the `run_tool_call_loop` function, validating the rejection of vision markers for non-vision providers and ensuring correct streaming of final text chunks. - Added tests to verify that CLI-only tools are blocked in prompt mode and that native tool results are persisted as tool messages. - Enhanced the `Agent` class with tests for error handling and event publishing during single runs, ensuring robust agent behavior in various scenarios. - Overall, these additions strengthen the test suite by improving coverage and reliability in tool execution and agent interactions.
- Added new tests for the `run_tool_call_loop` function, including scenarios for auto-approving supervised tools on non-CLI channels and handling unknown tools with default max iterations. - Introduced `ErrorResultTool` and `FailingTool` to simulate error conditions during tool execution, improving coverage of error handling in the agent. - Updated the `ScriptedProvider` to return results wrapped in `anyhow::Result`, ensuring consistent error handling across test cases. - Overall, these enhancements strengthen the test suite by validating tool execution paths and agent behavior under various conditions.
- Enhanced formatting in various test cases for better clarity and consistency, including the use of line breaks and indentation. - Updated assertions to improve readability by aligning them with Rust's idiomatic style. - Overall, these changes aim to strengthen the test suite by making it more maintainable and easier to understand.
- Added comprehensive documentation to the `mod.rs` file, detailing the agent domain's purpose, key components, and their functionalities. - Improved clarity on how LLMs interact with the system, manage conversation history, and handle autonomous behaviors. - This update aims to enhance understanding and maintainability of the agent domain within the OpenHuman project.
- Enhanced comments and documentation in `dispatcher.rs`, `error.rs`, `hooks.rs`, and `harness/mod.rs` for better clarity and consistency. - Adjusted formatting in the `TurnContext` and `ToolCallRecord` structs to improve readability and understanding of their purpose and functionality. - Overall, these changes aim to strengthen the documentation and maintainability of the agent domain within the OpenHuman project.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR significantly expands test coverage across the agent orchestration, learning, and channel systems while refining public APIs and documentation. Key changes include expanding the Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes This PR introduces substantial heterogeneous changes across ~40 files: trait interface expansions ( Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…vior - Introduced new tests for the `DefaultMemoryLoader`, validating error propagation during primary recall failures and ensuring correct context emission when working memory is present. - Added tests to verify behavior when the working memory section exceeds budget constraints, enhancing memory management robustness. - Overall, these additions strengthen the test suite by improving coverage and reliability in memory handling and agent interactions.
…tests - Reformatted the `load_context` calls in memory loader tests for better readability by chaining method calls. - Enhanced documentation comments in the `pformat.rs` file to clarify the purpose and functionality of functions. - Improved consistency in spacing and formatting across various sections of the codebase, including the `interrupt.rs` and `builder.rs` files. - Overall, these changes aim to enhance code clarity and maintainability within the test suite and related modules.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/openhuman/agent/bus.rs (1)
361-373:⚠️ Potential issue | 🟠 MajorGuard global-handler registration test with
BUS_HANDLER_LOCKto avoid race conditions.Line 367 calls
register_agent_handlers()directly without acquiring the lock used by other bus tests (e.g.,mock_agent_run_turn()). This can race with tests that install handler stubs, causing nondeterministic test outcomes.Suggested fix
#[tokio::test] async fn register_agent_handlers_exposes_run_turn_on_global_registry() { // Read-only smoke test: prove the production registration path // actually puts `agent.run_turn` on the global registry. Does // NOT dispatch — dispatching from this test would race with any // other test that installs a handler override (e.g. the channel // dispatch integration tests in `runtime_dispatch.rs`). - register_agent_handlers(); + let _guard = use_real_agent_handler().await; let registry = crate::core::event_bus::native_registry() .expect("native registry should be initialized after register_agent_handlers");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/bus.rs` around lines 361 - 373, Wrap the test's call to register_agent_handlers() with the same global test lock used elsewhere to prevent races: acquire BUS_HANDLER_LOCK at the start of the async test (before calling register_agent_handlers()) and hold it until after the registry assertion completes, then release it; this mirrors other tests like mock_agent_run_turn() and ensures the AGENT_RUN_TURN_METHOD registration check runs under the BUS_HANDLER_LOCK to avoid nondeterministic handler-install races.src/openhuman/agent/triage/routing.rs (1)
391-446:⚠️ Potential issue | 🟠 MajorSerialize cache-mutating tests with a lock to prevent race-induced flakiness at line 422.
The test at line 407 (
decide_with_cache_respects_ttl_window) primes the cache withCacheState::Degradedand expects it to be returned within the TTL window. However, line 422 accepts eitherDegradedorRemote, which masks a race where a concurrent test clears the cache between setup and assertion. Since tests run in parallel by default and access the shared staticDECISION_CACHE, the assertion should be strict:assert_eq!(state, CacheState::Degraded).Fix: Add a test-scoped
Mutexlock to serialize the three tokio tests that accessDECISION_CACHE:✅ Deterministic test fix
#[cfg(test)] mod tests { use super::*; + static CACHE_TEST_LOCK: tokio::sync::Mutex<()> = tokio::sync::Mutex::const_new(()); #[tokio::test] async fn mark_degraded_forces_remote_on_next_resolve() { + let _guard = CACHE_TEST_LOCK.lock().await; clear_cache().await; mark_degraded().await;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/triage/routing.rs` around lines 391 - 446, The three tests that mutate DECISION_CACHE (mark_degraded_forces_remote_on_next_resolve, decide_with_cache_respects_ttl_window, cache_snapshot_returns_none_when_empty_and_refreshes_expired_entries) can race; serialize them by introducing a test-scoped async mutex and acquiring it at the start of each test to hold for the test duration, and tighten the assertion in decide_with_cache_respects_ttl_window to assert_eq!(state, CacheState::Degraded) (replace the current permissive matches! check). Use a single shared symbol like TEST_DECISION_CACHE_LOCK (a tokio::sync::Mutex<()> created with once_cell::sync::Lazy or once_cell::sync::OnceLock) and in each test do let _guard = TEST_DECISION_CACHE_LOCK.lock().await before calling clear_cache / priming DECISION_CACHE / decide_with_cache so the cache setup and assertion cannot be interleaved by other tests.
🧹 Nitpick comments (6)
src/openhuman/channels/commands.rs (1)
307-308: Simplify redundant default initialization in test setup.
Config::default()already provides defaultchannels_config, so the explicit reassignment can be removed.Proposed simplification
- let mut config = Config::default(); - config.channels_config = crate::openhuman::config::ChannelsConfig::default(); + let config = Config::default(); doctor_channels(config).await.unwrap();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/channels/commands.rs` around lines 307 - 308, The test creates a Config via Config::default() then immediately overwrites config.channels_config with ChannelsConfig::default(), which is redundant because Config::default() already sets channels_config; remove the explicit reassignment line (the assignment to config.channels_config = crate::openhuman::config::ChannelsConfig::default()) and rely on Config::default() to provide the default ChannelsConfig in the test setup for functions using Config.src/openhuman/agent/multimodal.rs (2)
629-632: Use a tempdir-derived missing path for platform-safe determinism.Line 629 currently uses a hardcoded absolute path; generating a missing path from
tempfile::tempdir()is more reliable across CI environments.Refactor suggestion
- let err = normalize_local_image("/definitely/missing.png", 1024) + let temp = tempfile::tempdir().unwrap(); + let missing = temp.path().join("missing.png"); + let err = normalize_local_image(missing.to_string_lossy().as_ref(), 1024) .await .expect_err("missing local file should fail");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/multimodal.rs` around lines 629 - 632, The test uses a hardcoded absolute path; replace it with a path derived from a tempfile::tempdir() so the "missing" file is platform-safe and deterministic: create a TempDir via tempfile::tempdir(), join a non-existent filename (e.g., temp_dir.path().join("definitely_missing.png")) and pass that path to normalize_local_image(...) in the test; keep the expect_err assertion and the check that err.to_string() contains "not found or unreadable" so the intent of normalize_local_image(...) is preserved.
612-614: Avoid exact full error-string equality in tests.Line 612-Line 614 asserts the entire formatted error text, which is brittle to harmless wording changes.
Refactor suggestion
- assert_eq!( - validate_mime("x", "text/plain").unwrap_err().to_string(), - "multimodal image MIME type is not allowed for 'x': text/plain" - ); + let err = validate_mime("x", "text/plain").unwrap_err().to_string(); + assert!(err.contains("MIME type is not allowed")); + assert!(err.contains("text/plain"));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/multimodal.rs` around lines 612 - 614, The test currently asserts exact equality of the formatted error string from validate_mime("x", "text/plain"), which is brittle; change the assertion to check for a stable substring or error kind instead — for example call let err = validate_mime("x", "text/plain").unwrap_err().to_string(); then assert!(err.contains("multimodal image MIME type is not allowed") && err.contains("'x'")) or, if the error type is exposed, match on the error variant from validate_mime instead of comparing the full Display output; this keeps the test robust while still verifying the expected failure.src/openhuman/context/debug_dump.rs (1)
754-785: Consider using RAII for test workspace cleanup.Several tests create temporary workspaces but use
let _ = std::fs::remove_dir_all(workspace);for cleanup, which won't execute if the test panics. Consider using thetempfilecrate'sTempDirfor automatic cleanup, or at minimum document that manual cleanup may leave directories on panic.♻️ Example with tempfile crate
+use tempfile::TempDir; + #[test] fn render_main_agent_dump_includes_tool_instructions_and_skill_count() { - let workspace = - std::env::temp_dir().join(format!("openhuman_debug_main_{}", uuid::Uuid::new_v4())); - std::fs::create_dir_all(&workspace).unwrap(); + let temp_dir = TempDir::new().unwrap(); + let workspace = temp_dir.path(); std::fs::write(workspace.join("SOUL.md"), "# Soul\nContext").unwrap(); // ... rest of test - let _ = std::fs::remove_dir_all(workspace); + // TempDir automatically cleans up on drop }Also applies to: 817-858, 860-902, 904-961
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/context/debug_dump.rs` around lines 754 - 785, Tests like render_main_agent_dump_includes_tool_instructions_and_skill_count create temporary workspace dirs and call std::fs::remove_dir_all at the end, which won't run on panic; update these tests to use tempfile::TempDir (or similar RAII temp-dir helper) so the directory is created via TempDir::new() and files are written to temp_dir.path(), removing the manual std::fs::remove_dir_all calls; apply this change to the test above and the other tests referencing the same pattern (those that call render_main_agent_dump, use StubTool and ToolCategory, and manually remove the workspace).src/openhuman/agent/bus.rs (1)
30-41: Remove duplicated rustdoc block forAgentTurnRequest.The same doc text appears twice consecutively, which adds noise and makes maintenance harder.
✂️ Suggested cleanup
-/// Full owned payload for a single agentic turn executed through the bus. -/// -/// All fields are either owned values, [`Arc`]s, or channel handles — the -/// bus carries them by value without touching serialization. Consumers can -/// therefore pass trait objects (`Arc<dyn Provider>`, tool trait-object -/// registries) and streaming senders (`on_delta`) through unchanged. /// Full owned payload for a single agentic turn executed through the bus. /// /// All fields are either owned values, [`Arc`]s, or channel handles — the /// bus carries them by value without touching serialization. Consumers can /// therefore pass trait objects (`Arc<dyn Provider>`, tool trait-object /// registries) and streaming senders (`on_delta`) through unchanged.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/bus.rs` around lines 30 - 41, The rustdoc comment for AgentTurnRequest is duplicated; remove the redundant copy so only a single doc block precedes the AgentTurnRequest definition, ensuring the remaining doc text is unchanged and whitespace/formatting around the struct/type declaration is preserved.src/openhuman/agent/harness/session/runtime.rs (1)
491-492: Avoid fixed sleep for async event assertions.At Line 491, a fixed 20ms delay can be flaky under CI load. Prefer waiting on a bounded condition (
timeout+ loop) before asserting captured events.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/session/runtime.rs` around lines 491 - 492, Replace the fixed sleep+lock with a bounded wait loop using tokio::time::timeout: repeatedly acquire the events Mutex (events.lock().await) and check the desired condition on captured (e.g., non-empty or expected length) until it is satisfied or the timeout elapses, returning a test failure or panic when timeout occurs; use tokio::time::Duration for the overall timeout and a short async yield (or await on the Mutex again) inside the loop to avoid busy-waiting so the assertion is robust under CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/harness/mod.rs`:
- Around line 2-5: Normalize the module-level doc comment spacing in the harness
header: remove the empty doc-comment line after the first //! and ensure each
leading //! has a single space then the sentence (no trailing spaces), and apply
the same fix to the duplicated header region (lines referenced in the review as
the block and lines 13-14); update the module-level doc comment in
src/openhuman/agent/harness/mod.rs so all //! lines are consistently formatted
to satisfy Prettier --check.
In `@src/openhuman/agent/memory_loader.rs`:
- Around line 226-230: The test name
new_enforces_minimum_limit_and_zero_budget_disables_output is misleading because
the assertion on DefaultMemoryLoader::new(0, 0.4).with_max_chars(0) and
load_context(&MockMemory, "hello") expects the memory context to be present;
rename the test function to reflect that zero budget still produces output
(e.g., new_enforces_minimum_limit_and_zero_budget_still_outputs_context) so the
name matches the behavior verified by load_context and the assert on "[Memory
context]".
In `@src/openhuman/agent/triage/escalation.rs`:
- Around line 226-227: Replace the timing-dependent
sleep(Duration::from_millis(20)).await calls with condition-based waits: instead
of sleeping, poll the actual condition you care about (e.g. a shared state,
queue length, or an async flag) inside a timeout using tokio::time::timeout and
either a small awaitable backoff (tokio::task::yield_now().await) or await a
synchronization primitive (tokio::sync::Notify, watch, or Barrier) that your
code under test signals; update the test(s) in escalation.rs that call
sleep(...) to wait for the concrete predicate (wrapped in timeout) so assertions
become deterministic and CI-safe.
In `@src/openhuman/agent/triage/events.rs`:
- Line 89: Replace the brittle fixed wait call
sleep(Duration::from_millis(20)).await with a bounded wait-until loop that polls
the event capture predicate and fails on timeout: use tokio::time::timeout (or
Instant+deadline) and repeatedly check the captured events/count/content in a
small sleep loop until the predicate is met or the timeout elapses, then assert
the expected events; this removes the race and makes the test deterministic
(locate the exact call sleep(Duration::from_millis(20)).await in the test and
replace it with the timeout+predicate loop around the captured events).
---
Outside diff comments:
In `@src/openhuman/agent/bus.rs`:
- Around line 361-373: Wrap the test's call to register_agent_handlers() with
the same global test lock used elsewhere to prevent races: acquire
BUS_HANDLER_LOCK at the start of the async test (before calling
register_agent_handlers()) and hold it until after the registry assertion
completes, then release it; this mirrors other tests like mock_agent_run_turn()
and ensures the AGENT_RUN_TURN_METHOD registration check runs under the
BUS_HANDLER_LOCK to avoid nondeterministic handler-install races.
In `@src/openhuman/agent/triage/routing.rs`:
- Around line 391-446: The three tests that mutate DECISION_CACHE
(mark_degraded_forces_remote_on_next_resolve,
decide_with_cache_respects_ttl_window,
cache_snapshot_returns_none_when_empty_and_refreshes_expired_entries) can race;
serialize them by introducing a test-scoped async mutex and acquiring it at the
start of each test to hold for the test duration, and tighten the assertion in
decide_with_cache_respects_ttl_window to assert_eq!(state, CacheState::Degraded)
(replace the current permissive matches! check). Use a single shared symbol like
TEST_DECISION_CACHE_LOCK (a tokio::sync::Mutex<()> created with
once_cell::sync::Lazy or once_cell::sync::OnceLock) and in each test do let
_guard = TEST_DECISION_CACHE_LOCK.lock().await before calling clear_cache /
priming DECISION_CACHE / decide_with_cache so the cache setup and assertion
cannot be interleaved by other tests.
---
Nitpick comments:
In `@src/openhuman/agent/bus.rs`:
- Around line 30-41: The rustdoc comment for AgentTurnRequest is duplicated;
remove the redundant copy so only a single doc block precedes the
AgentTurnRequest definition, ensuring the remaining doc text is unchanged and
whitespace/formatting around the struct/type declaration is preserved.
In `@src/openhuman/agent/harness/session/runtime.rs`:
- Around line 491-492: Replace the fixed sleep+lock with a bounded wait loop
using tokio::time::timeout: repeatedly acquire the events Mutex
(events.lock().await) and check the desired condition on captured (e.g.,
non-empty or expected length) until it is satisfied or the timeout elapses,
returning a test failure or panic when timeout occurs; use tokio::time::Duration
for the overall timeout and a short async yield (or await on the Mutex again)
inside the loop to avoid busy-waiting so the assertion is robust under CI.
In `@src/openhuman/agent/multimodal.rs`:
- Around line 629-632: The test uses a hardcoded absolute path; replace it with
a path derived from a tempfile::tempdir() so the "missing" file is platform-safe
and deterministic: create a TempDir via tempfile::tempdir(), join a non-existent
filename (e.g., temp_dir.path().join("definitely_missing.png")) and pass that
path to normalize_local_image(...) in the test; keep the expect_err assertion
and the check that err.to_string() contains "not found or unreadable" so the
intent of normalize_local_image(...) is preserved.
- Around line 612-614: The test currently asserts exact equality of the
formatted error string from validate_mime("x", "text/plain"), which is brittle;
change the assertion to check for a stable substring or error kind instead — for
example call let err = validate_mime("x",
"text/plain").unwrap_err().to_string(); then assert!(err.contains("multimodal
image MIME type is not allowed") && err.contains("'x'")) or, if the error type
is exposed, match on the error variant from validate_mime instead of comparing
the full Display output; this keeps the test robust while still verifying the
expected failure.
In `@src/openhuman/channels/commands.rs`:
- Around line 307-308: The test creates a Config via Config::default() then
immediately overwrites config.channels_config with ChannelsConfig::default(),
which is redundant because Config::default() already sets channels_config;
remove the explicit reassignment line (the assignment to config.channels_config
= crate::openhuman::config::ChannelsConfig::default()) and rely on
Config::default() to provide the default ChannelsConfig in the test setup for
functions using Config.
In `@src/openhuman/context/debug_dump.rs`:
- Around line 754-785: Tests like
render_main_agent_dump_includes_tool_instructions_and_skill_count create
temporary workspace dirs and call std::fs::remove_dir_all at the end, which
won't run on panic; update these tests to use tempfile::TempDir (or similar RAII
temp-dir helper) so the directory is created via TempDir::new() and files are
written to temp_dir.path(), removing the manual std::fs::remove_dir_all calls;
apply this change to the test above and the other tests referencing the same
pattern (those that call render_main_agent_dump, use StubTool and ToolCategory,
and manually remove the workspace).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eee59033-9afc-445a-b18d-26dbe8420cc6
📒 Files selected for processing (28)
src/openhuman/agent/bus.rssrc/openhuman/agent/dispatcher.rssrc/openhuman/agent/error.rssrc/openhuman/agent/harness/memory_context.rssrc/openhuman/agent/harness/mod.rssrc/openhuman/agent/harness/session/runtime.rssrc/openhuman/agent/harness/session/turn.rssrc/openhuman/agent/harness/session/types.rssrc/openhuman/agent/harness/tool_loop.rssrc/openhuman/agent/hooks.rssrc/openhuman/agent/host_runtime.rssrc/openhuman/agent/memory_loader.rssrc/openhuman/agent/mod.rssrc/openhuman/agent/multimodal.rssrc/openhuman/agent/schemas.rssrc/openhuman/agent/triage/escalation.rssrc/openhuman/agent/triage/events.rssrc/openhuman/agent/triage/routing.rssrc/openhuman/channels/bus.rssrc/openhuman/channels/commands.rssrc/openhuman/channels/context.rssrc/openhuman/channels/routes.rssrc/openhuman/context/debug_dump.rssrc/openhuman/context/prompt.rssrc/openhuman/learning/prompt_sections.rssrc/openhuman/learning/reflection.rssrc/openhuman/learning/tool_tracker.rssrc/openhuman/learning/user_profile.rs
| sleep(Duration::from_millis(20)).await; | ||
|
|
There was a problem hiding this comment.
Replace fixed sleeps with condition-based waits in async tests.
Using sleep(Duration::from_millis(20)) creates timing-dependent assertions and can intermittently fail in slower CI runs.
🛠️ Suggested deterministic pattern
- use tokio::time::{sleep, Duration};
+ use tokio::time::{timeout, Duration};
@@
- sleep(Duration::from_millis(20)).await;
+ timeout(Duration::from_secs(1), async {
+ loop {
+ if seen.lock().await.iter().any(|event| matches!(
+ event,
+ DomainEvent::TriggerEvaluated { decision, external_id, .. }
+ if decision == "drop" && external_id == "esc-1"
+ )) {
+ break;
+ }
+ tokio::task::yield_now().await;
+ }
+ })
+ .await
+ .expect("timed out waiting for evaluated event");
@@
- sleep(Duration::from_millis(20)).await;
+ timeout(Duration::from_secs(1), async {
+ loop {
+ if seen.lock().await.iter().any(|event| matches!(
+ event,
+ DomainEvent::TriggerEvaluated { decision, external_id, .. }
+ if decision == "acknowledge" && external_id == "esc-1"
+ )) {
+ break;
+ }
+ tokio::task::yield_now().await;
+ }
+ })
+ .await
+ .expect("timed out waiting for evaluated event");Also applies to: 261-262
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/triage/escalation.rs` around lines 226 - 227, Replace the
timing-dependent sleep(Duration::from_millis(20)).await calls with
condition-based waits: instead of sleeping, poll the actual condition you care
about (e.g. a shared state, queue length, or an async flag) inside a timeout
using tokio::time::timeout and either a small awaitable backoff
(tokio::task::yield_now().await) or await a synchronization primitive
(tokio::sync::Notify, watch, or Barrier) that your code under test signals;
update the test(s) in escalation.rs that call sleep(...) to wait for the
concrete predicate (wrapped in timeout) so assertions become deterministic and
CI-safe.
| publish_escalated(&envelope, "trigger_reactor"); | ||
| publish_failed(&envelope, "boom"); | ||
|
|
||
| sleep(Duration::from_millis(20)).await; |
There was a problem hiding this comment.
Avoid fixed-delay waiting in the event-bus test.
Line 89 uses a fixed 20ms sleep, which can race with scheduler timing and intermittently fail on busy runners. Prefer a timeout + predicate loop on captured event count/content.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/triage/events.rs` at line 89, Replace the brittle fixed
wait call sleep(Duration::from_millis(20)).await with a bounded wait-until loop
that polls the event capture predicate and fails on timeout: use
tokio::time::timeout (or Instant+deadline) and repeatedly check the captured
events/count/content in a small sleep loop until the predicate is met or the
timeout elapses, then assert the expected events; this removes the race and
makes the test deterministic (locate the exact call
sleep(Duration::from_millis(20)).await in the test and replace it with the
timeout+predicate loop around the captured events).
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/openhuman/agent/memory_loader.rs (1)
226-230:⚠️ Potential issue | 🟡 MinorRename this test to match the asserted behavior.
Line 226 says “zero budget disables output”, but Line 229 asserts output is present. This makes the test intent misleading.
Proposed rename
- async fn new_enforces_minimum_limit_and_zero_budget_disables_output() { + async fn new_enforces_minimum_limit_and_zero_budget_is_unlimited() {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/memory_loader.rs` around lines 226 - 230, The test function new_enforces_minimum_limit_and_zero_budget_disables_output is misnamed because it asserts that output is present; rename it to reflect the asserted behavior (e.g., new_enforces_minimum_limit_and_zero_budget_allows_output) and keep the existing test body that constructs DefaultMemoryLoader::new(0, 0.4).with_max_chars(0), calls load_context(&MockMemory, "hello"), and asserts context.contains("[Memory context]").src/openhuman/agent/triage/escalation.rs (1)
187-187:⚠️ Potential issue | 🟡 MinorReplace fixed sleeps with condition-based waits
These sleeps are timing-dependent and can intermittently fail in CI under load. Please wait on a concrete predicate under
timeoutinstead.🛠️ Deterministic wait pattern
- use tokio::time::{sleep, Duration}; + use tokio::time::{timeout, Duration}; @@ - sleep(Duration::from_millis(20)).await; + timeout(Duration::from_secs(1), async { + loop { + if seen.lock().await.iter().any(|event| matches!( + event, + DomainEvent::TriggerEvaluated { decision, external_id, .. } + if decision == "drop" && external_id == "esc-1" + )) { + break; + } + tokio::task::yield_now().await; + } + }) + .await + .expect("timed out waiting for evaluated event");Also applies to: 243-243, 278-278, 321-321, 362-362
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/triage/escalation.rs` at line 187, The code imports and uses fixed sleeps (tokio::time::sleep, Duration) in escalation.rs which makes tests flaky; replace those sleeps with a condition-based wait under a timeout. Add a small helper (e.g., wait_for_condition or wait_for_predicate) that accepts an async predicate, a total timeout, and a poll interval, and implement it using tokio::time::timeout(...) around a loop that checks the predicate and awaits a short sleep between checks; then replace direct sleep(...) calls in the functions referencing escalation logic (look for uses of sleep and Duration in escalation.rs) with calls to this helper so tests wait for the concrete predicate rather than a fixed delay. Ensure the helper returns a clear Result/timeout error so callers can handle failures.
🧹 Nitpick comments (2)
src/openhuman/agent/harness/session/turn.rs (2)
1171-1179: Guard scripted provider queue exhaustion to avoid panic-based test failures.
remove(0)panics when the scripted response queue is empty, which makes failures less diagnosable than returning a typed error.Proposed fix
async fn chat( &self, request: ChatRequest<'_>, _model: &str, _temperature: f64, ) -> Result<ChatResponse> { self.requests.lock().await.push(request.messages.to_vec()); - self.responses.lock().await.remove(0) + let mut responses = self.responses.lock().await; + if responses.is_empty() { + return Err(anyhow::anyhow!( + "SequenceProvider response script exhausted" + )); + } + responses.remove(0) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/session/turn.rs` around lines 1171 - 1179, The chat method currently does self.responses.lock().await.remove(0) which will panic if the scripted response queue is empty; change it to guard against empty responses and return a typed error instead. Inside chat (function name chat, types ChatRequest/ChatResponse, fields requests/responses), after pushing the request, acquire the responses lock and check if responses.is_empty(); if empty, return an Err with a clear error (e.g., an anyhow/eyre or crate-specific error) indicating the scripted response queue is exhausted; otherwise pop or remove the first element and return it as Ok. Ensure no unwrap/remove(0) is left to avoid panic-based test failures.
1124-1124: Use a bounded wait instead of fixed sleep for async event assertions.The fixed 20ms delay can be flaky under load. Prefer waiting until expected events arrive (with timeout).
Proposed fix
- use tokio::time::{sleep, timeout, Duration}; + use tokio::time::{timeout, Duration}; @@ - sleep(Duration::from_millis(20)).await; + timeout(Duration::from_secs(1), async { + loop { + if events.lock().await.len() >= 2 { + break; + } + tokio::task::yield_now().await; + } + }) + .await + .expect("expected ToolExecutionStarted and ToolExecutionCompleted events"); let captured = events.lock().await;Also applies to: 1453-1455
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/session/turn.rs` at line 1124, Replace the fixed sleep(Duration::from_millis(20)).await used to wait for async events with a bounded wait that uses tokio::time::timeout and actively awaits the expected event or condition; specifically, where you currently call sleep(...) use timeout(Duration::from_millis(<sane-ms>), async { loop { if check_event_or_stream_has_expected_item() { break } yield to the executor (e.g., tokio::task::yield_now().await) } }).await and handle the Err(timeout) case as a test failure; apply this change for all occurrences that import or use sleep, timeout, Duration (including the spots referenced by the review: the usage at the import line and the other occurrences around the second instance).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/triage/escalation.rs`:
- Around line 333-335: The test is pattern-matching a non-existent field `error`
on DomainEvent::TriggerEscalationFailed; update the match arms to use the actual
field name `reason` instead (e.g. DomainEvent::TriggerEscalationFailed {
external_id, reason } if external_id == "esc-1" &&
reason.contains("missing-agent")), and apply the same change for the other
occurrence where TriggerEscalationFailed is matched.
- Around line 189-197: The tests share a fixed external_id ("esc-1" and
similarly "esc-ack", "esc-react-fail", "esc-escalate-fail") causing cross-test
collisions; change the envelope() helper(s) to generate a unique external_id per
invocation (for example by appending a short random value or UUID) when calling
TriggerEnvelope::from_composio so each test publishes/asserts against an
isolated id; apply the same change to the other envelope helpers used in the
tests (the ones that currently return "esc-ack", "esc-react-fail",
"esc-escalate-fail") to prevent parallel-test interference.
---
Duplicate comments:
In `@src/openhuman/agent/memory_loader.rs`:
- Around line 226-230: The test function
new_enforces_minimum_limit_and_zero_budget_disables_output is misnamed because
it asserts that output is present; rename it to reflect the asserted behavior
(e.g., new_enforces_minimum_limit_and_zero_budget_allows_output) and keep the
existing test body that constructs DefaultMemoryLoader::new(0,
0.4).with_max_chars(0), calls load_context(&MockMemory, "hello"), and asserts
context.contains("[Memory context]").
In `@src/openhuman/agent/triage/escalation.rs`:
- Line 187: The code imports and uses fixed sleeps (tokio::time::sleep,
Duration) in escalation.rs which makes tests flaky; replace those sleeps with a
condition-based wait under a timeout. Add a small helper (e.g.,
wait_for_condition or wait_for_predicate) that accepts an async predicate, a
total timeout, and a poll interval, and implement it using
tokio::time::timeout(...) around a loop that checks the predicate and awaits a
short sleep between checks; then replace direct sleep(...) calls in the
functions referencing escalation logic (look for uses of sleep and Duration in
escalation.rs) with calls to this helper so tests wait for the concrete
predicate rather than a fixed delay. Ensure the helper returns a clear
Result/timeout error so callers can handle failures.
---
Nitpick comments:
In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 1171-1179: The chat method currently does
self.responses.lock().await.remove(0) which will panic if the scripted response
queue is empty; change it to guard against empty responses and return a typed
error instead. Inside chat (function name chat, types ChatRequest/ChatResponse,
fields requests/responses), after pushing the request, acquire the responses
lock and check if responses.is_empty(); if empty, return an Err with a clear
error (e.g., an anyhow/eyre or crate-specific error) indicating the scripted
response queue is exhausted; otherwise pop or remove the first element and
return it as Ok. Ensure no unwrap/remove(0) is left to avoid panic-based test
failures.
- Line 1124: Replace the fixed sleep(Duration::from_millis(20)).await used to
wait for async events with a bounded wait that uses tokio::time::timeout and
actively awaits the expected event or condition; specifically, where you
currently call sleep(...) use timeout(Duration::from_millis(<sane-ms>), async {
loop { if check_event_or_stream_has_expected_item() { break } yield to the
executor (e.g., tokio::task::yield_now().await) } }).await and handle the
Err(timeout) case as a test failure; apply this change for all occurrences that
import or use sleep, timeout, Duration (including the spots referenced by the
review: the usage at the import line and the other occurrences around the second
instance).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4aa53ed4-7cf4-45ec-ae86-16bf485770ff
📒 Files selected for processing (12)
src/openhuman/agent/harness/definition.rssrc/openhuman/agent/harness/interrupt.rssrc/openhuman/agent/harness/mod.rssrc/openhuman/agent/harness/session/builder.rssrc/openhuman/agent/harness/session/runtime.rssrc/openhuman/agent/harness/session/turn.rssrc/openhuman/agent/harness/subagent_runner.rssrc/openhuman/agent/memory_loader.rssrc/openhuman/agent/pformat.rssrc/openhuman/agent/triage/escalation.rssrc/openhuman/agent/triage/evaluator.rssrc/openhuman/agent/triage/mod.rs
✅ Files skipped from review due to trivial changes (7)
- src/openhuman/agent/harness/interrupt.rs
- src/openhuman/agent/triage/mod.rs
- src/openhuman/agent/pformat.rs
- src/openhuman/agent/harness/mod.rs
- src/openhuman/agent/harness/definition.rs
- src/openhuman/agent/triage/evaluator.rs
- src/openhuman/agent/harness/session/builder.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/openhuman/agent/harness/session/runtime.rs
…provider - Introduced tests to validate the detection of missing commands in the `SelfHealingInterceptor`, ensuring proper handling of recognized and unrecognized patterns. - Added tests for the `ensure_polyfill_dir` method to confirm directory creation and path exposure. - Enhanced local AI provider tests to verify correct behavior when the service is ready and the appropriate tier is set, as well as ensuring local metadata is utilized during provider resolution. - Overall, these additions strengthen the test suite by improving coverage and reliability in self-healing and local AI functionalities.
…scalation tests - Updated the `envelope` function to accept an `external_id` parameter, improving flexibility in test scenarios. - Modified various test cases to utilize specific external IDs, enhancing clarity and ensuring accurate event assertions. - Introduced a mutex lock in local AI tests to prevent race conditions, ensuring reliable test execution. - Overall, these changes improve the robustness and maintainability of the test suite for escalation and local AI functionalities.
…ization - Eliminated unused test modules from `hooks.rs`, `memory_loader.rs`, `fork_context.rs`, `interrupt.rs`, and `parse.rs` to streamline the codebase. - Enhanced overall test organization by consolidating relevant tests into appropriate files, improving maintainability and readability. - These changes aim to simplify the test structure and focus on active test cases, ensuring a cleaner and more efficient testing environment.
- Reformatted assertions in multiple test cases for better readability by aligning them with Rust's idiomatic style. - Enhanced the structure of test cases by using line breaks and consistent indentation, improving overall clarity. - These changes aim to strengthen the test suite by making it more maintainable and easier to understand.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openhuman/agent/triage/routing.rs (1)
408-447:⚠️ Potential issue | 🟠 MajorCache tests are masking race conditions instead of isolating them.
Line 423 accepts
Remote, and Lines 427-430 assume an empty global cache. With parallel tests sharingDECISION_CACHE, this can both hide regressions and produce flaky outcomes. Serialize cache-state tests with a dedicated test mutex, then keep strict assertions.Suggested fix
#[cfg(test)] mod tests { use super::*; use crate::openhuman::local_ai::presets::apply_preset_to_config; + static CACHE_TEST_MUTEX: tokio::sync::Mutex<()> = tokio::sync::Mutex::const_new(()); @@ #[tokio::test] async fn mark_degraded_forces_remote_on_next_resolve() { + let _guard = CACHE_TEST_MUTEX.lock().await; clear_cache().await; @@ #[tokio::test] async fn decide_with_cache_respects_ttl_window() { + let _guard = CACHE_TEST_MUTEX.lock().await; clear_cache().await; @@ - assert!(matches!(state, CacheState::Degraded | CacheState::Remote)); + assert_eq!(state, CacheState::Degraded); } #[tokio::test] async fn cache_snapshot_returns_none_when_empty_and_refreshes_expired_entries() { + let _guard = CACHE_TEST_MUTEX.lock().await; clear_cache().await; assert!(cache_snapshot().await.is_none());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/triage/routing.rs` around lines 408 - 447, These tests share the global DECISION_CACHE and currently race with other tests; add a dedicated test mutex (e.g., TEST_CACHE_LOCK: once_cell::sync::Lazy<tokio::sync::Mutex<()>>) and acquire it at the top of both decide_with_cache_respects_ttl_window and cache_snapshot_returns_none_and_refreshes_expired_entries so the cache state is serialized; keep calling clear_cache() after acquiring the lock, set DECISION_CACHE directly as you already do, and tighten the first assertion to assert_eq!(state, CacheState::Degraded) (remove the permissive matches allowing Remote) to ensure the cache hit is verified exactly; in the second test also acquire the same mutex before asserting cache_snapshot().is_none() and before setting the expired entry so the test cannot be affected by parallel tests that touch DECISION_CACHE.
♻️ Duplicate comments (1)
src/openhuman/agent/triage/escalation.rs (1)
244-245:⚠️ Potential issue | 🟡 MinorReplace timing-based sleeps with condition-based waits in async tests.
Using fixed
sleep(Duration::from_millis(20))makes these tests scheduler-sensitive and flaky under slower CI.Suggested deterministic wait pattern
- use tokio::time::{sleep, Duration}; + use tokio::time::{timeout, Duration}; @@ - sleep(Duration::from_millis(20)).await; + timeout(Duration::from_secs(1), async { + loop { + if seen.lock().await.iter().any(|event| matches!( + event, + DomainEvent::TriggerEvaluated { decision, external_id, .. } + if decision == "drop" && external_id == "esc-drop" + )) { + break; + } + tokio::task::yield_now().await; + } + }) + .await + .expect("timed out waiting for evaluated event");Also applies to: 280-281, 324-325, 366-367
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/triage/escalation.rs` around lines 244 - 245, Replace the flaky sleep(Duration::from_millis(20)).await calls with deterministic condition-based waits: stop using the literal sleep and instead await a concrete signal (e.g., a tokio::sync::Notify, oneshot channel, or poll a shared atomic/flag inside a tokio::time::timeout loop) that tests can trigger when the expected state/event occurs; specifically, replace each occurrence of sleep(Duration::from_millis(20)).await (and the other similar sleeps at the noted locations) with code that uses tokio::time::timeout to bound waiting and either await notify.notified().await or repeatedly check the condition in a short interval until satisfied, failing the test on timeout to avoid scheduler-dependent flakes.
🧹 Nitpick comments (4)
src/openhuman/local_ai/mod.rs (1)
3-5: Add a short rustdoc for the test mutex’s intent.This global test lock coordinates cross-module shared-state mutation; a brief comment will make that coupling explicit.
Suggested tweak
+/// Test-only global lock to serialize tests that mutate shared Local AI service state. #[cfg(test)] pub(crate) static LOCAL_AI_TEST_MUTEX: once_cell::sync::Lazy<std::sync::Mutex<()>> = once_cell::sync::Lazy::new(|| std::sync::Mutex::new(()));As per coding guidelines "
{src/**/*.rs,.github/**/*.md,docs/**/*.md}: Add concise rustdoc / code comments where the flow is not obvious".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/local_ai/mod.rs` around lines 3 - 5, Add a concise rustdoc above the static LOCAL_AI_TEST_MUTEX explaining that it is a test-only global mutex used to coordinate and serialize cross-module shared-state mutation in tests; mention that it exists to avoid test flakiness by preventing concurrent access to the shared resources and that it is only compiled under #[cfg(test)]. Update the comment to live immediately above the declaration of LOCAL_AI_TEST_MUTEX so readers of the mod.rs can immediately see the intent and scope.src/openhuman/voice/postprocess.rs (1)
179-184: Make shared-state restoration panic-safe in this test.If the async call panics, the previous state is never restored, which can leak state into later tests.
Panic-safe restore pattern
- let result = rt.block_on(cleanup_transcription(&config, "um hello uh world", None)); - service.status.lock().state = previous; + let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { + rt.block_on(cleanup_transcription(&config, "um hello uh world", None)) + })); + service.status.lock().state = previous; + let result = result.expect("cleanup_transcription panicked");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/voice/postprocess.rs` around lines 179 - 184, The test currently saves previous = service.status.lock().state and then resets it after calling rt.block_on, which leaks state if the async call panics; make restoration panic-safe by creating a RAII guard that stores a reference to service (local_ai::global(&config) result) and previous state and restores service.status.lock().state in its Drop impl, construct that guard immediately after computing previous (e.g., let _restore = RestoreGuard::new(&service, previous)), then call rt.block_on(cleanup_transcription(...)); the guard's Drop will always run and restore the state even if the call panics.tests/agent_memory_loader_public.rs (1)
24-35: UselimitinScriptedMemory::recallto keep the test double faithful.At Line 27,
_limitis ignored, so these tests can’t catch regressions whereDefaultMemoryLoaderpasses a wrong recall limit.♻️ Proposed refinement
- async fn recall( - &self, - query: &str, - _limit: usize, - _session_id: Option<&str>, - ) -> Result<Vec<MemoryEntry>> { - if query.contains("working.user") { - Ok(self.working.clone()) - } else { - Ok(self.primary.clone()) - } - } + async fn recall( + &self, + query: &str, + limit: usize, + _session_id: Option<&str>, + ) -> Result<Vec<MemoryEntry>> { + let source = if query.contains("working.user") { + &self.working + } else { + &self.primary + }; + Ok(source.iter().take(limit).cloned().collect()) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/agent_memory_loader_public.rs` around lines 24 - 35, The test double's recall method ignores the _limit parameter; update ScriptedMemory::recall (the async fn recall) to respect the provided limit by returning at most limit entries from either self.working or self.primary (preserve original ordering), and handle limit == 0 or limit > len gracefully; ensure the method uses the limit argument (not _limit) so tests can detect incorrect recall limits from DefaultMemoryLoader.tests/agent_builder_public.rs (1)
12-116: Consider centralizing repeated test doubles into shared test utilities.
StubProvider/StubMemoryare duplicated across public test files, which can drift when trait contracts change.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/agent_builder_public.rs` around lines 12 - 116, The StubProvider, StubTool, and StubMemory test doubles are duplicated and should be centralized: extract these structs and their async_trait impls (StubProvider::chat_with_system, ::chat; StubTool::name/description/parameters_schema/execute; StubMemory::store/recall/get/list/forget/count/health_check/name) into a single shared test support module (e.g., tests::support or tests::utils) and re-export them for use by public tests; update test files to import the shared module instead of redefining the doubles and ensure the module exposes any needed trait imports (Provider, Tool, Memory, MemoryCategory, ChatRequest, ChatResponse, ToolResult, MemoryEntry) and async_trait usage so trait contract changes only need one update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/harness/parse.rs`:
- Around line 768-777: The test
structured_tool_call_and_history_helpers_round_trip_expected_shapes currently
ignores ToolCall.id even though parse_structured_tool_calls receives it; update
the test to assert that the parsed representation preserves the call id (e.g., a
tool_call_id field) by passing a ToolCall with a known id and asserting
parsed[0].tool_call_id (or the shared parsed-call model field) equals "call-1";
if parse_structured_tool_calls does not yet populate that field, modify
parse_structured_tool_calls to carry ToolCall.id into the parsed struct so
native tool-result correlation is preserved.
In `@src/openhuman/agent/triage/routing.rs`:
- Around line 404-405: Restore the missing lower-bound check on TTL: update the
assertion that validates snap.ttl_remaining_ms so it ensures the value is
strictly positive and does not exceed CACHE_TTL (e.g., assert that
snap.ttl_remaining_ms > 0 and snap.ttl_remaining_ms <= CACHE_TTL.as_millis()).
Locate the assertion referencing snap.ttl_remaining_ms and CACHE_TTL in
routing.rs and add the lower-bound condition to enforce the test invariant.
In `@tests/agent_harness_public.rs`:
- Around line 276-282: The loop waiting for hook completion should register the
Notify's Notified future before checking the condition to avoid a lost-wakeup
race; change the loop in tests/agent_harness_public.rs to call let notified =
notify.notified() (or equivalent) first, then check calls.lock().len() and await
notified.await only if the condition still isn't met, using the existing calls
and notify symbols so the test reliably observes notifications without
intermittent timeouts.
---
Outside diff comments:
In `@src/openhuman/agent/triage/routing.rs`:
- Around line 408-447: These tests share the global DECISION_CACHE and currently
race with other tests; add a dedicated test mutex (e.g., TEST_CACHE_LOCK:
once_cell::sync::Lazy<tokio::sync::Mutex<()>>) and acquire it at the top of both
decide_with_cache_respects_ttl_window and
cache_snapshot_returns_none_and_refreshes_expired_entries so the cache state is
serialized; keep calling clear_cache() after acquiring the lock, set
DECISION_CACHE directly as you already do, and tighten the first assertion to
assert_eq!(state, CacheState::Degraded) (remove the permissive matches allowing
Remote) to ensure the cache hit is verified exactly; in the second test also
acquire the same mutex before asserting cache_snapshot().is_none() and before
setting the expired entry so the test cannot be affected by parallel tests that
touch DECISION_CACHE.
---
Duplicate comments:
In `@src/openhuman/agent/triage/escalation.rs`:
- Around line 244-245: Replace the flaky sleep(Duration::from_millis(20)).await
calls with deterministic condition-based waits: stop using the literal sleep and
instead await a concrete signal (e.g., a tokio::sync::Notify, oneshot channel,
or poll a shared atomic/flag inside a tokio::time::timeout loop) that tests can
trigger when the expected state/event occurs; specifically, replace each
occurrence of sleep(Duration::from_millis(20)).await (and the other similar
sleeps at the noted locations) with code that uses tokio::time::timeout to bound
waiting and either await notify.notified().await or repeatedly check the
condition in a short interval until satisfied, failing the test on timeout to
avoid scheduler-dependent flakes.
---
Nitpick comments:
In `@src/openhuman/local_ai/mod.rs`:
- Around line 3-5: Add a concise rustdoc above the static LOCAL_AI_TEST_MUTEX
explaining that it is a test-only global mutex used to coordinate and serialize
cross-module shared-state mutation in tests; mention that it exists to avoid
test flakiness by preventing concurrent access to the shared resources and that
it is only compiled under #[cfg(test)]. Update the comment to live immediately
above the declaration of LOCAL_AI_TEST_MUTEX so readers of the mod.rs can
immediately see the intent and scope.
In `@src/openhuman/voice/postprocess.rs`:
- Around line 179-184: The test currently saves previous =
service.status.lock().state and then resets it after calling rt.block_on, which
leaks state if the async call panics; make restoration panic-safe by creating a
RAII guard that stores a reference to service (local_ai::global(&config) result)
and previous state and restores service.status.lock().state in its Drop impl,
construct that guard immediately after computing previous (e.g., let _restore =
RestoreGuard::new(&service, previous)), then call
rt.block_on(cleanup_transcription(...)); the guard's Drop will always run and
restore the state even if the call panics.
In `@tests/agent_builder_public.rs`:
- Around line 12-116: The StubProvider, StubTool, and StubMemory test doubles
are duplicated and should be centralized: extract these structs and their
async_trait impls (StubProvider::chat_with_system, ::chat;
StubTool::name/description/parameters_schema/execute;
StubMemory::store/recall/get/list/forget/count/health_check/name) into a single
shared test support module (e.g., tests::support or tests::utils) and re-export
them for use by public tests; update test files to import the shared module
instead of redefining the doubles and ensure the module exposes any needed trait
imports (Provider, Tool, Memory, MemoryCategory, ChatRequest, ChatResponse,
ToolResult, MemoryEntry) and async_trait usage so trait contract changes only
need one update.
In `@tests/agent_memory_loader_public.rs`:
- Around line 24-35: The test double's recall method ignores the _limit
parameter; update ScriptedMemory::recall (the async fn recall) to respect the
provided limit by returning at most limit entries from either self.working or
self.primary (preserve original ordering), and handle limit == 0 or limit > len
gracefully; ensure the method uses the limit argument (not _limit) so tests can
detect incorrect recall limits from DefaultMemoryLoader.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fa377f87-61ee-4e60-8404-b38e6d3ad1d8
📒 Files selected for processing (14)
src/openhuman/agent/harness/fork_context.rssrc/openhuman/agent/harness/interrupt.rssrc/openhuman/agent/harness/parse.rssrc/openhuman/agent/harness/self_healing.rssrc/openhuman/agent/hooks.rssrc/openhuman/agent/memory_loader.rssrc/openhuman/agent/triage/escalation.rssrc/openhuman/agent/triage/routing.rssrc/openhuman/local_ai/mod.rssrc/openhuman/voice/postprocess.rstests/agent_builder_public.rstests/agent_harness_public.rstests/agent_memory_loader_public.rstests/agent_multimodal_public.rs
💤 Files with no reviewable changes (3)
- src/openhuman/agent/memory_loader.rs
- src/openhuman/agent/harness/fork_context.rs
- src/openhuman/agent/harness/interrupt.rs
✅ Files skipped from review due to trivial changes (1)
- src/openhuman/agent/harness/self_healing.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/openhuman/agent/hooks.rs
| fn structured_tool_call_and_history_helpers_round_trip_expected_shapes() { | ||
| let tool_calls = vec![ToolCall { | ||
| id: "call-1".into(), | ||
| name: "echo".into(), | ||
| arguments: "{\"value\":\"hello\"}".into(), | ||
| }]; | ||
|
|
||
| let parsed = parse_structured_tool_calls(&tool_calls); | ||
| assert_eq!(parsed.len(), 1); | ||
| assert_eq!(parsed[0].arguments, serde_json::json!({ "value": "hello" })); |
There was a problem hiding this comment.
ToolCall.id is not validated (and is currently dropped) in structured parse coverage.
This test only checks arguments, but parse_structured_tool_calls takes ToolCall { id, ... } and discards id. Given native flows rely on call IDs for tool-result correlation, this path should preserve and assert tool_call_id behavior explicitly (or use the shared parsed-call model that includes it).
Suggested fix sketch
pub(crate) struct ParsedToolCall {
pub name: String,
pub arguments: serde_json::Value,
+ pub tool_call_id: Option<String>,
} pub(crate) fn parse_structured_tool_calls(tool_calls: &[ToolCall]) -> Vec<ParsedToolCall> {
tool_calls
.iter()
.map(|call| ParsedToolCall {
name: call.name.clone(),
arguments: serde_json::from_str::<serde_json::Value>(&call.arguments)
.unwrap_or_else(|_| serde_json::Value::Object(serde_json::Map::new())),
+ tool_call_id: Some(call.id.clone()),
})
.collect()
} let parsed = parse_structured_tool_calls(&tool_calls);
assert_eq!(parsed.len(), 1);
assert_eq!(parsed[0].arguments, serde_json::json!({ "value": "hello" }));
+assert_eq!(parsed[0].tool_call_id.as_deref(), Some("call-1"));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/harness/parse.rs` around lines 768 - 777, The test
structured_tool_call_and_history_helpers_round_trip_expected_shapes currently
ignores ToolCall.id even though parse_structured_tool_calls receives it; update
the test to assert that the parsed representation preserves the call id (e.g., a
tool_call_id field) by passing a ToolCall with a known id and asserting
parsed[0].tool_call_id (or the shared parsed-call model field) equals "call-1";
if parse_structured_tool_calls does not yet populate that field, modify
parse_structured_tool_calls to carry ToolCall.id into the parsed struct so
native tool-result correlation is preserved.
| assert!(snap.ttl_remaining_ms <= CACHE_TTL.as_millis()); | ||
| } |
There was a problem hiding this comment.
Restore the lower-bound TTL check here.
Line 404 now only checks <= CACHE_TTL, so an already-expired (0) TTL would still pass despite the test’s stated invariant.
Suggested fix
- assert!(snap.ttl_remaining_ms <= CACHE_TTL.as_millis());
+ assert!(
+ snap.ttl_remaining_ms > 0 && snap.ttl_remaining_ms <= CACHE_TTL.as_millis()
+ );📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert!(snap.ttl_remaining_ms <= CACHE_TTL.as_millis()); | |
| } | |
| assert!( | |
| snap.ttl_remaining_ms > 0 && snap.ttl_remaining_ms <= CACHE_TTL.as_millis() | |
| ); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/triage/routing.rs` around lines 404 - 405, Restore the
missing lower-bound check on TTL: update the assertion that validates
snap.ttl_remaining_ms so it ensures the value is strictly positive and does not
exceed CACHE_TTL (e.g., assert that snap.ttl_remaining_ms > 0 and
snap.ttl_remaining_ms <= CACHE_TTL.as_millis()). Locate the assertion
referencing snap.ttl_remaining_ms and CACHE_TTL in routing.rs and add the
lower-bound condition to enforce the test invariant.
| tokio::time::timeout(std::time::Duration::from_secs(1), async { | ||
| loop { | ||
| if calls.lock().len() == 2 { | ||
| break; | ||
| } | ||
| notify.notified().await; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does tokio::sync::Notify::notify_waiters buffer notifications sent before a task awaits notified(), and what pattern is recommended to avoid missed wakeups in loops?
💡 Result:
No, tokio::sync::Notify::notify_waiters does not buffer notifications for tasks that await notified after the call. It only notifies tasks that are already registered waiters at the time of the call (i.e., Notified futures that have been created). Unlike notify_one, it does not store a permit for future waiters. To avoid missed wakeups in loops (e.g., when using Notify to signal that a condition like data availability might be true), use this pattern: tokio::pin!(future); loop { future.as_mut.enable; // Register for notify_one wakeups before checking condition to prevent lost notifications if condition_is_true { // e.g., try_recv or similar non-blocking check break value; } future.as_mut.await; // Wait for notification future.set(notify.notified); // Reset for next iteration }
Citations:
- 1: https://docs.rs/tokio/latest/tokio/sync/struct.Notify.html
- 2: https://github.com/tokio-rs/tokio/blob/master/tokio/src/sync/notify.rs
- 3: https://doc.servo.org/tokio/sync/notify/struct.Notified.html
🏁 Script executed:
# First, verify the file and view the actual code at the specified lines
head -n 290 tests/agent_harness_public.rs | tail -n 30Repository: tinyhumansai/openhuman
Length of output: 834
🏁 Script executed:
# Search for RecordingHook definition and how notify is used
rg -n "RecordingHook" tests/agent_harness_public.rs -A 15 | head -50Repository: tinyhumansai/openhuman
Length of output: 1394
🏁 Script executed:
# Verify the surrounding context to understand the full async test structure
sed -n '250,290p' tests/agent_harness_public.rsRepository: tinyhumansai/openhuman
Length of output: 1207
Fix potential lost-wakeup race in hook completion wait loop.
At lines 277-281, the current pattern checks the condition before awaiting the notification. If notify_waiters() is called between the check and the await, the wakeup is lost and the test may timeout intermittently. Tokio's Notify does not buffer notifications for tasks that call notified() after the notification is sent.
Safer wait pattern
tokio::time::timeout(std::time::Duration::from_secs(1), async {
loop {
+ let notified = notify.notified();
if calls.lock().len() == 2 {
break;
}
- notify.notified().await;
+ notified.await;
}
})Register the Notified future before checking the condition to prevent the race.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| tokio::time::timeout(std::time::Duration::from_secs(1), async { | |
| loop { | |
| if calls.lock().len() == 2 { | |
| break; | |
| } | |
| notify.notified().await; | |
| } | |
| tokio::time::timeout(std::time::Duration::from_secs(1), async { | |
| loop { | |
| let notified = notify.notified(); | |
| if calls.lock().len() == 2 { | |
| break; | |
| } | |
| notified.await; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/agent_harness_public.rs` around lines 276 - 282, The loop waiting for
hook completion should register the Notify's Notified future before checking the
condition to avoid a lost-wakeup race; change the loop in
tests/agent_harness_public.rs to call let notified = notify.notified() (or
equivalent) first, then check calls.lock().len() and await notified.await only
if the condition still isn't met, using the existing calls and notify symbols so
the test reliably observes notifications without intermittent timeouts.
…nscript handling - Updated the `key_bytes_from_string` function to use a more concise containment check for special characters. - Changed the message writing in the `write_transcript` function to utilize `writeln!` for better formatting. - Enhanced the condition in `latest_in_dir` to use `is_none_or` for improved readability and clarity. - These changes aim to streamline code and enhance maintainability across the API and transcript handling modules.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/openhuman/accessibility/permissions.rs (1)
74-90:⚠️ Potential issue | 🟠 MajorAdd debug/trace diagnostics for this changed permission-request flow.
Line 74-90 executes an unsafe external-call path without entry/exit, branch, or call-result logging, which makes failures hard to trace.
Proposed logging patch
#[cfg(target_os = "macos")] pub fn request_accessibility_access() { + log::debug!("[permissions] requesting macOS accessibility access"); unsafe { let keys = [kAXTrustedCheckOptionPrompt]; let values = [kCFBooleanTrue]; let options = CFDictionaryCreate( kCFAllocatorDefault, keys.as_ptr(), values.as_ptr(), 1, std::ptr::null(), std::ptr::null(), ); - let _ = AXIsProcessTrustedWithOptions(options); + let trusted = AXIsProcessTrustedWithOptions(options); + log::debug!( + "[permissions] AXIsProcessTrustedWithOptions completed trusted={trusted} options_null={}", + options.is_null() + ); if !options.is_null() { CFRelease(options); + log::debug!("[permissions] released CoreFoundation options dictionary"); + } else { + log::debug!("[permissions] CoreFoundation options dictionary was null; skipping release"); } } + log::debug!("[permissions] macOS accessibility request flow complete"); }As per coding guidelines,
src/**/*.rs: “Add substantial, development-oriented logs on new/changed flows; include logs at entry/exit points, branch decisions, external calls, retries/timeouts, state transitions, and error handling paths”.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/accessibility/permissions.rs` around lines 74 - 90, In request_accessibility_access(), add development-oriented logs: log at function entry and exit, before and after the external unsafe calls (CFDictionaryCreate and AXIsProcessTrustedWithOptions), and on branch decisions (e.g., whether options is null and whether AXIsProcessTrustedWithOptions returned true/false). Use trace/debug for entry/exit and pre-call state, and error/warn for unexpected/null results; include the pointer value of options and the boolean result from AXIsProcessTrustedWithOptions to aid tracing. Ensure logs are added around CFRelease too so release paths are observable.src/openhuman/memory/embeddings.rs (1)
143-145:⚠️ Potential issue | 🟠 MajorPreserve
ORT_LIB_LOCATIONprecedence after resolvingruntime_lib.At Line 143–145, once
runtime_libexists andORT_DYLIB_PATHis set, execution continues into Linux fallback (Line 149+), which can overwrite the explicit location with a system path. That breaks the documented priority order and can load the wrong ONNX runtime.Proposed fix
if runtime_lib.exists() { env::set_var("ORT_DYLIB_PATH", runtime_lib); + return; }Also applies to: 149-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/memory/embeddings.rs` around lines 143 - 145, The code currently sets ORT_DYLIB_PATH from runtime_lib but then continues into the Linux fallback which can overwrite it; change the control flow so ORT_LIB_LOCATION env var takes precedence and if you successfully set ORT_DYLIB_PATH from runtime_lib you skip the Linux fallback. Concretely, check env::var("ORT_LIB_LOCATION") first and return/skip further processing if present, and after calling env::set_var("ORT_DYLIB_PATH", runtime_lib) immediately exit the function or branch (do not fall through to the Linux fallback block), so the runtime_lib assignment cannot be overridden by the fallback code.
🧹 Nitpick comments (1)
src/openhuman/memory/embeddings.rs (1)
123-163: Add debug/trace logs for ORT dylib path branch decisions.This changed resolution flow would benefit from branch-level diagnostics (selected source, candidate path, fallback used) to make runtime failures easier to triage.
As per coding guidelines:
src/**/*.rs: "Add substantial, development-oriented logs on new/changed flows; include logs at ... branch decisions ... and error handling paths."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/memory/embeddings.rs` around lines 123 - 163, The function ensure_fastembed_ort_dylib_path lacks diagnostic logging for its branch decisions; add development-level logs (using the project's logger or tracing crate) that record: entry into ensure_fastembed_ort_dylib_path and current ORT_DYLIB_PATH if set, when ORT_LIB_LOCATION is present and the inspected candidate path (and whether it was a file), which platform runtime_lib path was checked and whether it existed, and which system fallback candidate (from the Linux array) was selected or if none matched; instrument checks around env::var_os("ORT_DYLIB_PATH"), env::var_os("ORT_LIB_LOCATION"), the candidate.is_file() branch, the runtime_lib.exists() branch, and the Linux fallback loop so each branch decision and final action (setting ORT_DYLIB_PATH or leaving it unset) is logged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/config/schema/load.rs`:
- Around line 513-517: The binding `config` is currently immutable but
`Config::apply_env_overrides(&mut self)` requires a mutable receiver; make the
`config` binding mutable (e.g., change the `let config` declaration to `let mut
config`) so that the subsequent call to `config.apply_env_overrides()` compiles;
update only the declaration where `Config { config_path: config_path.clone(),
workspace_dir, ..Default::default() }` is created and leave the rest unchanged.
In `@src/openhuman/memory/conversations/bus.rs`:
- Line 149: The compile error is caused by using &Path in the function signature
(the parameter named workspace_dir) without importing Path; add a top-level
import for std::path::Path (e.g. use std::path::Path;) so the symbol Path is in
scope for the function signature that references workspace_dir: &Path.
---
Outside diff comments:
In `@src/openhuman/accessibility/permissions.rs`:
- Around line 74-90: In request_accessibility_access(), add development-oriented
logs: log at function entry and exit, before and after the external unsafe calls
(CFDictionaryCreate and AXIsProcessTrustedWithOptions), and on branch decisions
(e.g., whether options is null and whether AXIsProcessTrustedWithOptions
returned true/false). Use trace/debug for entry/exit and pre-call state, and
error/warn for unexpected/null results; include the pointer value of options and
the boolean result from AXIsProcessTrustedWithOptions to aid tracing. Ensure
logs are added around CFRelease too so release paths are observable.
In `@src/openhuman/memory/embeddings.rs`:
- Around line 143-145: The code currently sets ORT_DYLIB_PATH from runtime_lib
but then continues into the Linux fallback which can overwrite it; change the
control flow so ORT_LIB_LOCATION env var takes precedence and if you
successfully set ORT_DYLIB_PATH from runtime_lib you skip the Linux fallback.
Concretely, check env::var("ORT_LIB_LOCATION") first and return/skip further
processing if present, and after calling env::set_var("ORT_DYLIB_PATH",
runtime_lib) immediately exit the function or branch (do not fall through to the
Linux fallback block), so the runtime_lib assignment cannot be overridden by the
fallback code.
---
Nitpick comments:
In `@src/openhuman/memory/embeddings.rs`:
- Around line 123-163: The function ensure_fastembed_ort_dylib_path lacks
diagnostic logging for its branch decisions; add development-level logs (using
the project's logger or tracing crate) that record: entry into
ensure_fastembed_ort_dylib_path and current ORT_DYLIB_PATH if set, when
ORT_LIB_LOCATION is present and the inspected candidate path (and whether it was
a file), which platform runtime_lib path was checked and whether it existed, and
which system fallback candidate (from the Linux array) was selected or if none
matched; instrument checks around env::var_os("ORT_DYLIB_PATH"),
env::var_os("ORT_LIB_LOCATION"), the candidate.is_file() branch, the
runtime_lib.exists() branch, and the Linux fallback loop so each branch decision
and final action (setting ORT_DYLIB_PATH or leaving it unset) is logged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 51d2bfd9-bccf-4fce-bc5e-867d022e873a
📒 Files selected for processing (11)
src/api/rest.rssrc/core/cli.rssrc/openhuman/accessibility/permissions.rssrc/openhuman/agent/agents/mod.rssrc/openhuman/agent/harness/session/transcript.rssrc/openhuman/composio/periodic.rssrc/openhuman/composio/types.rssrc/openhuman/config/schema/load.rssrc/openhuman/dev_paths.rssrc/openhuman/memory/conversations/bus.rssrc/openhuman/memory/embeddings.rs
💤 Files with no reviewable changes (1)
- src/core/cli.rs
✅ Files skipped from review due to trivial changes (5)
- src/openhuman/composio/types.rs
- src/openhuman/agent/agents/mod.rs
- src/api/rest.rs
- src/openhuman/dev_paths.rs
- src/openhuman/composio/periodic.rs
2291c8d to
09179c5
Compare
…bility - Removed unnecessary whitespace in `run_voice_server_command` for cleaner code. - Simplified directory search logic in `bundled_openclaw_prompts_dir` by using `find` instead of a loop. - Updated `request_accessibility_access` to use direct references for keys and values, enhancing clarity. - Improved documentation formatting in `mod.rs` and `types.rs` for better consistency. - Refactored `Config` initialization in `load.rs` to use struct update syntax for clarity. - Added `#[allow(clippy::too_many_arguments)]` annotations in multiple functions to address linter warnings. - Enhanced type definitions and function signatures for better type safety and readability in various modules. - Overall, these changes aim to improve code maintainability and readability across the project.
…ippy checks - Modified the GitHub Actions workflow to run clippy with warnings treated as errors for the `openhuman` package. - Enhanced the pre-push hook to include clippy checks, ensuring code quality before pushing changes. - Updated package.json to define a new script for running clippy, integrating it into the format check process. - These changes aim to improve code quality and maintainability by enforcing stricter linting rules.
- Streamlined the condition in the `key_bytes_from_string` function to improve readability by consolidating the if statement into a single line. - This change enhances code clarity while maintaining the original functionality of the key validation process.
- Modified the `format:check` script in `package.json` to exclude the clippy check, streamlining the formatting process. - This change simplifies the formatting workflow while maintaining the integrity of Rust formatting checks.
- Improved documentation across various core functions, including `build_registered_controllers`, `run_from_cli_args`, and `dispatch`, to clarify their purpose and usage. - Streamlined comments to provide clearer guidance on the flow of operations and error handling. - Enhanced the structure of the `EventBus` and `NativeRegistry` to improve readability and maintainability. - Overall, these changes aim to improve code clarity and facilitate easier navigation and understanding of the core components.
- Adjusted the import statements in `engine.rs` to improve organization and readability. - Moved the macOS-specific import of `validate_focused_target` to a more appropriate location and ensured consistent ordering of imports. - These changes aim to enhance code clarity and maintainability within the core module.
- Introduced a new `WebChannelEvent` struct to standardize event payloads for chat-related activities, including fields for event name, client ID, thread ID, request ID, and optional response details. - Improved documentation for the `attach_socketio` function, clarifying its role in setting up Socket.IO event handlers and the associated chat logic. - Removed unused structs and streamlined the event handling process to improve code clarity and maintainability across the core module.
…onsistency - Refactored the Socket.IO event handlers in `attach_socketio` to improve readability by standardizing the formatting and structure of the code. - Enhanced the organization of the event handling logic for `rpc:request`, `chat:start`, and `chat:cancel` events, making it easier to follow the flow of operations. - These changes aim to improve code maintainability and facilitate easier navigation within the Socket.IO integration.
- Introduced `SocketRpcRequest`, `ChatStartPayload`, and `ChatCancelPayload` structs to facilitate handling of Socket.IO events related to chat functionality. - These additions enhance the structure and clarity of the event payloads, improving the maintainability of the Socket.IO integration.
- Eliminated the `json_type_name` function from `socketio.rs` as it was not utilized in the current codebase. - This change helps to clean up the code and improve maintainability by removing unnecessary functions.
- Modified the clippy command in the GitHub Actions workflow to remove the `-D warnings` flag for the `openhuman` package, allowing warnings to be displayed without failing the build. - This change aims to improve the development experience by providing more flexibility during code analysis while still encouraging code quality.
- Eliminated the clippy command from the pre-push hook to streamline the pre-push checks. - Updated the failure message to reflect the removal of clippy, focusing on format, lint, TypeScript, and Rust errors only. - This change simplifies the pre-push process while maintaining essential checks for code quality.
- Introduced conditional imports for macOS in the tests module of `engine.rs` to support platform-specific functionality. - This change improves the test setup for macOS environments, ensuring compatibility and enhancing overall test coverage.
- Modified the `execute_tool_call` method calls in multiple test cases to include a second parameter, improving the accuracy of the tests. - This change ensures that the tests reflect the latest method signature and enhances the reliability of the test outcomes.
Summary
src/openhuman/agent/harness/session/turn.rs, including transcript resume, full tool-cycle turns, max-iteration failure, hook dispatch, and inline tool-result budgeting.src/openhuman/agent/harness/session/runtime.rs, including accessors, history reset, error sanitization, overlap detection, and native tool-call persistence helpers.src/openhuman/agent/harness/tool_loop.rs, including provider failures, tool execution failures, supervised auto-approval, default iteration fallback, and native tool result persistence.Problem
openhuman::agenttree still had large untested sections in the harness hot path, especially around turn execution, runtime helpers, and the legacy tool loop.Solution
turn.rsso the session lifecycle coverage now includes more of the real runtime path.Submission Checklist
cargo test --lib 'openhuman::agent::'and targeted harness test modulesImpact
src/openhuman/agentincreased, with the main runtime-heavy harness files materially improved:session/turn.rsto87.81%tool_loop.rsto89.52%session/runtime.rsto89.33%Related
openhuman::agentcoverage viamemory_loader,triage/escalation,interrupt, andtriage/routingSummary by CodeRabbit
Release Notes
New Features
Improvements
Tests